Introduction

America. What a place! The food, the freedom, and of course, the free market! In today’s 21st-century economy it can be a challenge to get a small business off the ground, but luckily there are several mechanisms in place which, if taken advantage of, can help first-time business owners build a profitable business. Most local governments encourage small businesses; however, to get ahead in today’s technological era, it is critical to use the best resources at one’s disposal. For restaurants, a large measure of success comes down to public opinion. The more people who have something good to say about your restaurant, the more people will trust that they will have a good experience there. This works in the other direction too: if you do not have excellent food or service, word of mouth can hurt you just as quickly. How exactly can you access a large survey of the public easily, you ask? Let us introduce you to…

Yelp is a business directory service and public review forum which develops, hosts, and markets the Yelp.com website and the Yelp mobile app. On the app, users can rate and review their favorite (or least favorite) businesses to help the community know what they have to offer. A high rating on the app usually leads to more business, so it is critical for small business owners to get a high rating from as many people as possible.

The Yelp Challenge

Every year, Yelp provides a subset of their data in a competition titled “The Yelp Open Dataset Challenge”. Here they provide data from 1,968,703 users on over 1.4 million business attributes like hours, parking, availability, and ambience. This is an incredibly large dataset so we took it upon ourselves to dive in and see what potential avenues we could take with the project. While this challenge provides a plethora of information on users and reviews, our focus for this analysis will be on businesses, their attributes and their overall ratings.

With this initial breakdown of states, we see that there are only about 11 states represented in this dataset with more than 500 business observations reported.

city         count
Las Vegas    29370
Toronto      18906
Phoenix      18766
Charlotte     9509
Scottsdale    8837
Calgary       7736
Pittsburgh    7017
Montréal      6449
Mesa          6080
Henderson     4892

The top ten cities in this dataset helped us narrow our focus for our analysis to the Entertainment Capital of the World, LAS VEGAS.
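
As a minimal sketch of how a city ranking like the one above can be produced, the snippet below counts businesses per city in base R. The `business` data frame here is a tiny hypothetical stand-in, not the Yelp data, and the original analysis may have used different tooling.

```r
# Hypothetical stand-in for the business table; only the `city` column matters here.
business <- data.frame(
  city = c("Las Vegas", "Toronto", "Las Vegas", "Phoenix", "Las Vegas", "Toronto"),
  stringsAsFactors = FALSE
)

# Count businesses per city and sort from most to least frequent.
city_counts <- sort(table(business$city), decreasing = TRUE)

# Keep the ten largest cities (fewer if the data has fewer cities).
top_cities <- head(city_counts, 10)
print(top_cities)
```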

Viva Las Vegas!

The U.S. map shows that our dataset only includes businesses in a select group of cities. Since Las Vegas, Nevada has the highest count of businesses, we decided to restrict our analysis to just Las Vegas and its surrounding suburbs. It is clear from the map of Nevada that all of the data for the state are clustered around Las Vegas, so we do not need to worry about other parts of the state. An analysis of just this area is incredibly useful due to the booming tourism industry there; with many people coming from out of town to visit, having high ratings on Yelp can ensure foot traffic coming to a business’s doorstep.


The following map shows the geographic distribution of businesses based on the number of reviews they have. The bigger, yellow bubbles represent businesses with a large number of reviews, while the smaller, purple bubbles represent businesses with a small number of reviews. It is not surprising that the businesses with the most reviews are concentrated in the center of the map - this area is close to downtown and the Las Vegas Strip, which is a commercial hub and very popular with tourists.

The following four histograms show the frequency distribution of how many reviews the businesses have. The first histogram shows the distribution of all the data:

The data are extremely right-skewed, which means that the vast majority of businesses have few reviews but there are outliers with many reviews - up to approximately 8,000. These outlier businesses are likely popular tourist spots and are concentrated downtown or around The Strip, as we can see from the map. Since the data are so heavily skewed, the histogram makes it difficult to visualize how the data are distributed. Here is a histogram that shows the distribution of businesses with only 2000 reviews or fewer…

…and only 250 or fewer…

…and only 50 or fewer:

This QQ-plot provides even more visual evidence of the fact that the number of reviews is not normally distributed:

In addition to the number of reviews, we wanted to see if there are any geographic patterns in how highly businesses are rated. In the following map, the darker reddish-colored bubbles represent businesses with high ratings, while the lighter yellow-colored bubbles represent businesses with low ratings.

It appears that highly and poorly rated businesses are fairly equally dispersed throughout the Las Vegas area. Although businesses downtown tend to have more ratings, they do not necessarily have higher ratings compared to businesses with less central locations. This histogram shows the frequency distribution of businesses’ ratings:

It is evident that the ratings data are left-skewed, meaning high ratings are more common than low ratings. Since neither the number of reviews nor the ratings are normally distributed, in order to observe the correlation between the two variables, we calculated the Spearman coefficient:

## 
##  Spearman's rank correlation rho
## 
## data:  NV_business$stars and NV_business$review_count
## S = 4e+12, p-value = 1e-04
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
##     rho 
## -0.0226

The p-value is close to zero, which indicates there is a statistically significant correlation between the two variables. However, the magnitude of the coefficient (-0.023) is so small that this correlation is negligible. Therefore we cannot make any strong conclusions about the relationship between ratings and the number of reviews.
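
The test above can be reproduced in outline with base R's `cor.test`. This is only a sketch on simulated data: the `NV_business` frame below is a hypothetical stand-in with the same two column names, so the printed numbers will not match the output above.

```r
set.seed(42)
# Hypothetical stand-in for NV_business: skewed review counts, half-star ratings.
n <- 2000
NV_business <- data.frame(
  review_count = round(rexp(n, rate = 1 / 30)) + 3,
  stars = sample(seq(1, 5, by = 0.5), n, replace = TRUE)
)

# Spearman's rho is computed on ranks, so it does not assume normality.
# exact = FALSE requests the asymptotic p-value, which suits data with many ties.
res <- cor.test(NV_business$stars, NV_business$review_count,
                method = "spearman", exact = FALSE)

rho <- unname(res$estimate)
cat(sprintf("rho = %.3f, p-value = %.3g\n", rho, res$p.value))
```

With large samples, even a very small rho can come out statistically significant, which is why the magnitude of the coefficient matters more than the p-value here.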

Our last map explores the geographic distribution of businesses with different levels of alcohol served. We thought this would be interesting because Las Vegas - especially The Strip - is known for partying and drinking. Since not all businesses in the dataset are restaurants or bars, we eliminated any NA values for the ‘Alcohol’ variable. Sure enough, the map shows that businesses with a full bar are generally concentrated downtown, near The Strip, while establishments that do not serve alcohol or only serve beer and wine are spread out throughout the city. The histogram legend shows that the plurality of businesses (with non-NA Alcohol values) do not serve any alcohol, and there are more businesses with a full bar than businesses that only serve beer and wine. It would be interesting to compare these results to another city that is not known for drinking and partying.

Data Cleaning

The Variables

Note: The data frame was converted to a data table to flatten the nested (inner-indexed) columns.
Below are the variables in the data set:

##  [1] "business_id"                          
##  [2] "name"                                 
##  [3] "address"                              
##  [4] "city"                                 
##  [5] "state"                                
##  [6] "postal_code"                          
##  [7] "latitude"                             
##  [8] "longitude"                            
##  [9] "stars"                                
## [10] "review_count"                         
## [11] "is_open"                              
## [12] "attributes.GoodForKids"               
## [13] "attributes.RestaurantsReservations"   
## [14] "attributes.GoodForMeal"               
## [15] "attributes.BusinessParking"           
## [16] "attributes.Caters"                    
## [17] "attributes.NoiseLevel"                
## [18] "attributes.RestaurantsTableService"   
## [19] "attributes.RestaurantsTakeOut"        
## [20] "attributes.RestaurantsPriceRange2"    
## [21] "attributes.OutdoorSeating"            
## [22] "attributes.BikeParking"               
## [23] "attributes.Ambience"                  
## [24] "attributes.HasTV"                     
## [25] "attributes.WiFi"                      
## [26] "attributes.Alcohol"                   
## [27] "attributes.RestaurantsAttire"         
## [28] "attributes.RestaurantsGoodForGroups"  
## [29] "attributes.RestaurantsDelivery"       
## [30] "attributes.BusinessAcceptsCreditCards"
## [31] "attributes.BusinessAcceptsBitcoin"    
## [32] "attributes.ByAppointmentOnly"         
## [33] "attributes.AcceptsInsurance"          
## [34] "attributes.Music"                     
## [35] "attributes.GoodForDancing"            
## [36] "attributes.CoatCheck"                 
## [37] "attributes.HappyHour"                 
## [38] "attributes.BestNights"                
## [39] "attributes.WheelchairAccessible"      
## [40] "attributes.DogsAllowed"               
## [41] "attributes.BYOBCorkage"               
## [42] "attributes.DriveThru"                 
## [43] "attributes.Smoking"                   
## [44] "attributes.AgesAllowed"               
## [45] "attributes.HairSpecializesIn"         
## [46] "attributes.Corkage"                   
## [47] "attributes.BYOB"                      
## [48] "attributes.DietaryRestrictions"       
## [49] "attributes.Open24Hours"               
## [50] "attributes.RestaurantsCounterService" 
## [51] "categories"                           
## [52] "hours.Monday"                         
## [53] "hours.Tuesday"                        
## [54] "hours.Wednesday"                      
## [55] "hours.Thursday"                       
## [56] "hours.Friday"                         
## [57] "hours.Saturday"                       
## [58] "hours.Sunday"

Now we want to remove the non-relevant variables.

Below is the percentage of missing values in every column:

##                                       percent_na
## business_id                                0.000
## name                                       0.000
## address                                    0.000
## city                                       0.000
## state                                      0.000
## postal_code                                0.000
## latitude                                   0.000
## longitude                                  0.000
## stars                                      0.000
## review_count                               0.000
## is_open                                    0.000
## attributes.GoodForKids                    70.737
## attributes.RestaurantsReservations        79.627
## attributes.GoodForMeal                    87.095
## attributes.BusinessParking                50.881
## attributes.Caters                         81.904
## attributes.NoiseLevel                     82.022
## attributes.RestaurantsTableService        92.476
## attributes.RestaurantsTakeOut             76.192
## attributes.RestaurantsPriceRange2         50.507
## attributes.OutdoorSeating                 77.426
## attributes.BikeParking                    57.774
## attributes.Ambience                       80.188
## attributes.HasTV                          80.040
## attributes.WiFi                           79.045
## attributes.Alcohol                        79.379
## attributes.RestaurantsAttire              81.232
## attributes.RestaurantsGoodForGroups       77.837
## attributes.RestaurantsDelivery            79.811
## attributes.BusinessAcceptsCreditCards     21.244
## attributes.BusinessAcceptsBitcoin         89.375
## attributes.ByAppointmentOnly              69.363
## attributes.AcceptsInsurance               94.506
## attributes.Music                          97.122
## attributes.GoodForDancing                 97.439
## attributes.CoatCheck                      98.147
## attributes.HappyHour                      97.246
## attributes.BestNights                     97.948
## attributes.WheelchairAccessible           87.464
## attributes.DogsAllowed                    96.120
## attributes.BYOBCorkage                    98.937
## attributes.DriveThru                      98.237
## attributes.Smoking                        98.185
## attributes.AgesAllowed                    99.854
## attributes.HairSpecializesIn              99.212
## attributes.Corkage                        99.320
## attributes.BYOB                           99.967
## attributes.DietaryRestrictions            99.983
## attributes.Open24Hours                    99.986
## attributes.RestaurantsCounterService      99.989
## categories                                 0.273
## hours.Monday                              26.165
## hours.Tuesday                             24.312
## hours.Wednesday                           23.752
## hours.Thursday                            23.403
## hours.Friday                              23.741
## hours.Saturday                            36.426
## hours.Sunday                              52.644

Columns with more than 95% NA values were removed, i.e., 16 columns.
Hence, after cleaning, the final data set has the 42 columns listed below:

##  [1] "business_id"                          
##  [2] "name"                                 
##  [3] "address"                              
##  [4] "city"                                 
##  [5] "state"                                
##  [6] "postal_code"                          
##  [7] "latitude"                             
##  [8] "longitude"                            
##  [9] "stars"                                
## [10] "review_count"                         
## [11] "is_open"                              
## [12] "attributes.GoodForKids"               
## [13] "attributes.RestaurantsReservations"   
## [14] "attributes.GoodForMeal"               
## [15] "attributes.BusinessParking"           
## [16] "attributes.Caters"                    
## [17] "attributes.NoiseLevel"                
## [18] "attributes.RestaurantsTableService"   
## [19] "attributes.RestaurantsTakeOut"        
## [20] "attributes.RestaurantsPriceRange2"    
## [21] "attributes.OutdoorSeating"            
## [22] "attributes.BikeParking"               
## [23] "attributes.Ambience"                  
## [24] "attributes.HasTV"                     
## [25] "attributes.WiFi"                      
## [26] "attributes.Alcohol"                   
## [27] "attributes.RestaurantsAttire"         
## [28] "attributes.RestaurantsGoodForGroups"  
## [29] "attributes.RestaurantsDelivery"       
## [30] "attributes.BusinessAcceptsCreditCards"
## [31] "attributes.BusinessAcceptsBitcoin"    
## [32] "attributes.ByAppointmentOnly"         
## [33] "attributes.AcceptsInsurance"          
## [34] "attributes.WheelchairAccessible"      
## [35] "categories"                           
## [36] "hours.Monday"                         
## [37] "hours.Tuesday"                        
## [38] "hours.Wednesday"                      
## [39] "hours.Thursday"                       
## [40] "hours.Friday"                         
## [41] "hours.Saturday"                       
## [42] "hours.Sunday"
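
The missing-value filter described above can be sketched in a few lines of base R. The `business` data frame and the `na_threshold` name are hypothetical and only illustrate the idea; the original cleaning code is not shown in this report.

```r
# Small hypothetical data frame standing in for the flattened business table.
business <- data.frame(
  stars = c(4, 3.5, 5, 2, 4.5),
  attributes.WiFi = c("free", NA, "no", NA, "free"),
  attributes.BYOB = c(NA, NA, NA, NA, NA),
  stringsAsFactors = FALSE
)

# Percentage of missing values per column.
percent_na <- colMeans(is.na(business)) * 100
print(round(percent_na, 3))

# Keep only columns at or below the threshold (95% in the text).
na_threshold <- 95
cleaned <- business[, percent_na <= na_threshold, drop = FALSE]
print(names(cleaned))
```

Note that this only catches values stored as `NA`; a placeholder string such as "None" would have to be recoded first.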

Note: The data type of some of the variables has been converted to categorical.

Structuring the Dataset

Following are the various business categories:

Major business categories







Structuring the Dataset into Multiple Business Categories

The data set has been segmented into six major categories:

* Automotive
* Food Bar Casinos
* Medical
* Real Estate Financial Advisory
* Personal Care
* Travel

EDA for Business Categories

## # A tibble: 6 x 2
##   Type                           star_ratings
##   <chr>                                 <dbl>
## 1 automotive                             3.82
## 2 food_bar_casinos                       3.50
## 3 medical                                3.78
## 4 personal_care                          4.04
## 5 real_estate_financial_advisory         3.44
## 6 travel                                 3.37
## # A tibble: 6 x 2
##   Type                               n
##   <chr>                          <int>
## 1 automotive                      3787
## 2 food_bar_casinos               10727
## 3 medical                         4223
## 4 personal_care                   3292
## 5 real_estate_financial_advisory  2677
## 6 travel                          1468

ANOVA for Star Ratings among the Business Categories

##                Df Sum Sq Mean Sq F value Pr(>F)    
## Type            5   1143     228     230 <2e-16 ***
## Residuals   26168  25989       1                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = stars ~ Type, data = major_types)
## 
## $Type
##                                                    diff     lwr      upr
## food_bar_casinos-automotive                     -0.3138 -0.3674 -0.26007
## medical-automotive                              -0.0356 -0.0992  0.02792
## personal_care-automotive                         0.2192  0.1516  0.28692
## real_estate_financial_advisory-automotive       -0.3825 -0.4542 -0.31077
## travel-automotive                               -0.4508 -0.5382 -0.36353
## medical-food_bar_casinos                         0.2781  0.2265  0.32970
## personal_care-food_bar_casinos                   0.5330  0.4764  0.58958
## real_estate_financial_advisory-food_bar_casinos -0.0687 -0.1301 -0.00737
## travel-food_bar_casinos                         -0.1371 -0.2161 -0.05806
## personal_care-medical                            0.2549  0.1889  0.32091
## real_estate_financial_advisory-medical          -0.3468 -0.4170 -0.27668
## travel-medical                                  -0.4152 -0.5012 -0.32916
## real_estate_financial_advisory-personal_care    -0.6017 -0.6756 -0.52781
## travel-personal_care                            -0.6701 -0.7592 -0.58095
## travel-real_estate_financial_advisory           -0.0684 -0.1606  0.02387
##                                                 p adj
## food_bar_casinos-automotive                     0.000
## medical-automotive                              0.600
## personal_care-automotive                        0.000
## real_estate_financial_advisory-automotive       0.000
## travel-automotive                               0.000
## medical-food_bar_casinos                        0.000
## personal_care-food_bar_casinos                  0.000
## real_estate_financial_advisory-food_bar_casinos 0.018
## travel-food_bar_casinos                         0.000
## personal_care-medical                           0.000
## real_estate_financial_advisory-medical          0.000
## travel-medical                                  0.000
## real_estate_financial_advisory-personal_care    0.000
## travel-personal_care                            0.000
## travel-real_estate_financial_advisory           0.281

The ANOVA p-value (< 2e-16) is well below 0.05, so we reject the null hypothesis that all business categories have the same average star rating. The post hoc Tukey HSD test shows where the differences lie: 13 of the 15 pairwise comparisons are significant at the 5% level, and only medical vs. automotive (adjusted p = 0.600) and travel vs. real estate/financial advisory (adjusted p = 0.281) are not.

Therefore, we conclude that average star ratings differ across business categories.
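
For readers who want to run the same kind of test, here is a minimal sketch using `aov` and `TukeyHSD` from base R. The data are simulated with only three categories, so the numbers are illustrative; the `major_types`, `Type` and `stars` names mirror the output above, while `significant_pairs` is a name introduced here.

```r
set.seed(7)
# Simulated stand-in for major_types: three categories with different mean ratings.
major_types <- data.frame(
  Type = rep(c("automotive", "food_bar_casinos", "personal_care"), each = 200),
  stars = c(rnorm(200, 3.8, 1), rnorm(200, 3.5, 1), rnorm(200, 4.0, 1))
)
major_types$Type <- factor(major_types$Type)

# One-way ANOVA: the null hypothesis is that all category means are equal.
fit <- aov(stars ~ Type, data = major_types)
print(summary(fit))

# Tukey HSD: pairwise differences with family-wise 95% confidence intervals.
tukey <- TukeyHSD(fit, conf.level = 0.95)
print(tukey$Type)

# Pairs whose adjusted p-value is below 0.05.
significant_pairs <- rownames(tukey$Type)[tukey$Type[, "p adj"] < 0.05]
print(significant_pairs)
```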

Business Categories with respect to Star Ratings

Among all the business categories, food, bar and casinos has the smallest interquartile range, which means the middle 50% of its star ratings are concentrated more tightly around the median than in the other categories.

Star Ratings Distribution for Business Category

Among all the business categories, food, bar and casinos has the star rating distribution closest to a normal distribution.

Therefore, we would be building the model for the food, bar and casinos business.

Preliminary Model Building

Now that we have done a thorough exploratory data analysis, we want to find out which variables have the strongest impact on rating, which is measured by the ‘stars’ variable (0-5). Since most variables are stored as factors, we did not draw a correlation plot and opted instead to look at the significance of each variable. Our first question is: “Does the number of reviews (review_count) have an impact on the star rating?” To answer it, we build our first model:

## 
## Call:
## lm(formula = stars ~ review, data = funday)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.4922 -0.4896  0.0185  0.5198  1.5244 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 3.47e+00   9.44e-03     368   <2e-16 ***
## review      3.26e-04   2.51e-05      13   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.824 on 8931 degrees of freedom
## Multiple R-squared:  0.0186, Adjusted R-squared:  0.0185 
## F-statistic:  169 on 1 and 8931 DF,  p-value: <2e-16

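The coefficient for review is positive (3.26e-04): each additional review is associated with a rating about 0.0003 stars higher. Both the intercept and review are highly significant (p < 2e-16). However, the multiple R-squared (0.0186) and adjusted R-squared (0.0185) are nearly identical and very low, so review count alone explains less than 2% of the variation in stars.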

## 
## Call:
## lm(formula = stars ~ review + good4groups + alcohol, data = funday)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.5210 -0.5221  0.0454  0.5054  1.9059 
## 
## Coefficients:
##                          Estimate Std. Error t value Pr(>|t|)    
## (Intercept)              3.32e+00   5.83e-02   56.94  < 2e-16 ***
## review                   2.72e-04   2.35e-05   11.57  < 2e-16 ***
## good4groupsTrue          3.34e-01   4.03e-02    8.27  < 2e-16 ***
## alcohol'full_bar'       -9.53e-02   5.07e-02   -1.88   0.0601 .  
## alcohol'none'           -2.61e-01   4.93e-02   -5.30  1.2e-07 ***
## alcoholNone              3.31e-01   5.17e-01    0.64   0.5221    
## alcoholu'beer_and_wine' -4.18e-02   5.28e-02   -0.79   0.4287    
## alcoholu'full_bar'      -1.37e-01   4.62e-02   -2.96   0.0031 ** 
## alcoholu'none'          -2.41e-01   4.62e-02   -5.21  2.0e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.729 on 5867 degrees of freedom
##   (3057 observations deleted due to missingness)
## Multiple R-squared:  0.0561, Adjusted R-squared:  0.0548 
## F-statistic: 43.6 on 8 and 5867 DF,  p-value: <2e-16

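In this next model, we add two factor variables, good4groups and alcohol. The coefficient for review remains positive, and the intercept, review and good4groups all have very small p-values (< 2e-16), so they are statistically significant. The alcohol levels are mixed: 'none', u'none' and u'full_bar' are significant, while 'full_bar', None and u'beer_and_wine' are not significant at the 0.05 level. The factor also still contains duplicated encodings of the same category (for example 'none' and u'none'), so we exclude it from the next model.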

## 
## Call:
## lm(formula = stars ~ review + good4groups + outdoorseat, data = funday)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.6053 -0.4480  0.0581  0.5460  1.9210 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)      3.07e+00   3.69e-02   83.31   <2e-16 ***
## review           3.05e-04   2.34e-05   13.06   <2e-16 ***
## good4groupsTrue  3.50e-01   3.79e-02    9.24   <2e-16 ***
## outdoorseatNone -2.23e-01   3.70e-01   -0.60     0.55    
## outdoorseatTrue  1.78e-01   2.12e-02    8.39   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.74 on 6203 degrees of freedom
##   (2725 observations deleted due to missingness)
## Multiple R-squared:  0.0557, Adjusted R-squared:  0.0551 
## F-statistic: 91.5 on 4 and 6203 DF,  p-value: <2e-16

Now we add outdoorseat to the model. The coefficient for review is still positive, and the estimates for review and good4groupsTrue barely change. The coefficient for outdoorseatTrue (0.178) means that businesses with outdoor seating are rated about 0.18 stars higher than the reference level, holding the other variables fixed; outdoorseatNone is not significant (p = 0.55). The intercept, review, good4groups and outdoorseatTrue all have very small p-values (< 2e-16). Therefore, restaurants with outdoor seating do tend to get BETTER stars.

## 
## Call:
## lm(formula = stars ~ review + good4groups + outdoorseat + pricerange, 
##     data = funday)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.5318 -0.4939  0.0154  0.5111  1.9542 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)      3.01e+00   3.81e-02   79.00  < 2e-16 ***
## review           2.72e-04   2.38e-05   11.43  < 2e-16 ***
## good4groupsTrue  3.37e-01   3.82e-02    8.83  < 2e-16 ***
## outdoorseatNone -2.82e-01   3.67e-01   -0.77   0.4426    
## outdoorseatTrue  1.80e-01   2.11e-02    8.52  < 2e-16 ***
## pricerange2      1.32e-01   1.99e-02    6.65  3.1e-11 ***
## pricerange3      1.07e-01   4.10e-02    2.62   0.0089 ** 
## pricerange4      3.08e-01   6.81e-02    4.52  6.4e-06 ***
## pricerangeNone   3.04e-01   5.19e-01    0.59   0.5577    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.734 on 6147 degrees of freedom
##   (2777 observations deleted due to missingness)
## Multiple R-squared:  0.0649, Adjusted R-squared:  0.0637 
## F-statistic: 53.4 on 8 and 6147 DF,  p-value: <2e-16

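Finally, we add pricerange to form model 4. All three price levels are significant relative to the lowest price range (pricerange2: 0.132, pricerange3: 0.107, pricerange4: 0.308), which suggests that price range does have an impact on stars, although the pattern is not monotonic. Even so, the adjusted R-squared is only 0.0637, so these models point to the need for further study…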
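
One way to ask whether pricerange matters as a whole factor, rather than level by level, is a partial F-test between the nested models. The sketch below uses simulated data; the `funday` frame is a hypothetical stand-in that only borrows the column names from the formulas above, and this comparison was not part of the original analysis.

```r
set.seed(3)
n <- 500
# Simulated stand-in for `funday`; column names follow the model formulas above.
funday <- data.frame(
  review = round(rexp(n, rate = 1 / 50)),
  good4groups = factor(sample(c("False", "True"), n, replace = TRUE)),
  outdoorseat = factor(sample(c("False", "True"), n, replace = TRUE)),
  pricerange = factor(sample(1:4, n, replace = TRUE))
)
funday$stars <- 3 + 0.3 * (funday$good4groups == "True") +
  0.1 * as.integer(funday$pricerange) + rnorm(n, sd = 0.7)

# Fit both models on the same complete rows so the comparison is valid.
complete <- na.omit(funday)
model3 <- lm(stars ~ review + good4groups + outdoorseat, data = complete)
model4 <- lm(stars ~ review + good4groups + outdoorseat + pricerange,
             data = complete)

# Partial F-test: does pricerange, taken as a whole factor, improve the fit?
comparison <- anova(model3, model4)
print(comparison)
```

Restricting both fits to the same rows matters because `lm` silently drops rows with missing values, and the models above were fitted on different numbers of observations.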

Note: The categories within alcohol have been reduced to ‘beer and wine’, ‘full bar’ and ‘none’.
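
A possible way to collapse the duplicated labels seen in the model output (for example 'none' and u'none') is sketched below. The `clean_label` helper is hypothetical, and treating a bare None as missing is an assumption; the report does not say how that value was handled.

```r
# Raw labels as they appear in the model output above.
raw_alcohol <- c("u'none'", "'none'", "u'full_bar'", "'full_bar'",
                 "u'beer_and_wine'", "None", "'beer_and_wine'")

# Hypothetical helper: strip the quoting left over from the raw export.
clean_label <- function(x) {
  x <- gsub("^u?'|'$", "", x)   # drop a leading u' or ' and a trailing '
  x[x == "None"] <- NA          # assumption: a bare None means "not recorded"
  x
}

alcohol <- factor(clean_label(raw_alcohol),
                  levels = c("beer_and_wine", "full_bar", "none"))
print(table(alcohol, useNA = "ifany"))
```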

Feature Selection

We used the leaps package in R to perform feature selection in order to determine the most appropriate variables for predicting star ratings, which we could then include in our final model. In order to do this, we created a subset of the entire dataset which only included variables that did not have linear dependencies and were appropriately coded as factor variables. The model selection criteria we will be looking at are R-squared, adjusted R-squared, Cp and BIC.

The plots below illustrate each of these measures, visually depicting which variables are the best fit for predicting ratings in the given dataset.

## Reordering variables and trying again:

For R-squared and adjusted R-squared, higher values are indicative of more accurate models since these measure the variation in the dependent variable (ratings) which is explained by the independent variables (restaurant attributes). In this particular subset, the highest values we are able to achieve are 12% for R-squared and 11% for adjusted R-squared. We can see which models correspond to these values in the graphs below.

plot(reg1, scale = "r2", main = "R^2")

plot(reg1, scale = "adjr2", main = "Adjusted R^2")

For Cp and BIC on the other hand, smaller values are indicative of better models. The best fit model corresponding to the lowest Cp has a value of 9.2 while for BIC we can choose between the six models that have a value of -160.
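
To make the four criteria concrete, the sketch below computes them by hand for three nested candidate models on simulated data, using base R only rather than leaps. The variable names are hypothetical, and the absolute BIC values need not match those reported by leaps; only the ranking between models is of interest here.

```r
set.seed(11)
n <- 300
# Simulated data: y depends on x1 and x2 only; x3 is pure noise.
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
d$y <- 1 + 0.8 * d$x1 + 0.5 * d$x2 + rnorm(n)

candidates <- list(
  m1 = y ~ x1,
  m2 = y ~ x1 + x2,
  m3 = y ~ x1 + x2 + x3
)
fits <- lapply(candidates, lm, data = d)

# Residual variance of the largest model, used as the noise estimate in Cp.
full <- fits$m3
sigma2_full <- sum(resid(full)^2) / df.residual(full)

criteria <- t(sapply(fits, function(f) {
  p <- length(coef(f))                       # parameters including the intercept
  rss <- sum(resid(f)^2)
  c(r2    = summary(f)$r.squared,
    adjr2 = summary(f)$adj.r.squared,
    cp    = rss / sigma2_full - n + 2 * p,   # Mallows' Cp
    bic   = BIC(f))
}))
print(round(criteria, 3))
```

R-squared can only grow as variables are added, which is why adjusted R-squared, Cp and BIC, all of which penalize model size, are the more useful guides when choosing between models of different sizes.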

## Subset selection object
## Call: regsubsets.formula(stars ~ attributes.NoiseLevel + attributes.RestaurantsTableService + 
##     attributes.RestaurantsTakeOut + attributes.RestaurantsPriceRange2 + 
##     attributes.OutdoorSeating + attributes.WiFi + attributes.Alcohol + 
##     attributes.RestaurantsAttire, data = food_data, nvmax = 10)
## 17 Variables  (and intercept)
##                                        Forced in Forced out
## attributes.NoiseLevel'loud'                FALSE      FALSE
## attributes.NoiseLevel'quiet'               FALSE      FALSE
## attributes.NoiseLevel'very_loud'           FALSE      FALSE
## attributes.RestaurantsTableServiceTrue     FALSE      FALSE
## attributes.RestaurantsTakeOutTrue          FALSE      FALSE
## attributes.RestaurantsPriceRange22         FALSE      FALSE
## attributes.RestaurantsPriceRange23         FALSE      FALSE
## attributes.RestaurantsPriceRange24         FALSE      FALSE
## attributes.OutdoorSeatingTrue              FALSE      FALSE
## attributes.WiFi'no'                        FALSE      FALSE
## attributes.WiFi'paid'                      FALSE      FALSE
## attributes.Alcohol'full_bar'               FALSE      FALSE
## attributes.Alcohol'none'                   FALSE      FALSE
## attributes.RestaurantsAttire'dressy'       FALSE      FALSE
## attributes.RestaurantsPriceRange2None      FALSE      FALSE
## attributes.RestaurantsAttire'formal'       FALSE      FALSE
## attributes.RestaurantsAttireNone           FALSE      FALSE
## 1 subsets of each size up to 11
## Selection Algorithm: exhaustive
##           attributes.NoiseLevel'loud' attributes.NoiseLevel'quiet'
## 1  ( 1 )  " "                         " "                         
## 2  ( 1 )  "*"                         " "                         
## 3  ( 1 )  "*"                         "*"                         
## 4  ( 1 )  "*"                         "*"                         
## 5  ( 1 )  "*"                         "*"                         
## 6  ( 1 )  "*"                         "*"                         
## 7  ( 1 )  "*"                         "*"                         
## 8  ( 1 )  "*"                         "*"                         
## 9  ( 1 )  "*"                         "*"                         
## 10  ( 1 ) "*"                         "*"                         
## 11  ( 1 ) "*"                         "*"                         
##           attributes.NoiseLevel'very_loud'
## 1  ( 1 )  " "                             
## 2  ( 1 )  " "                             
## 3  ( 1 )  " "                             
## 4  ( 1 )  " "                             
## 5  ( 1 )  "*"                             
## 6  ( 1 )  "*"                             
## 7  ( 1 )  "*"                             
## 8  ( 1 )  "*"                             
## 9  ( 1 )  "*"                             
## 10  ( 1 ) "*"                             
## 11  ( 1 ) "*"                             
##           attributes.RestaurantsTableServiceTrue
## 1  ( 1 )  "*"                                   
## 2  ( 1 )  "*"                                   
## 3  ( 1 )  "*"                                   
## 4  ( 1 )  "*"                                   
## 5  ( 1 )  "*"                                   
## 6  ( 1 )  "*"                                   
## 7  ( 1 )  "*"                                   
## 8  ( 1 )  "*"                                   
## 9  ( 1 )  "*"                                   
## 10  ( 1 ) "*"                                   
## 11  ( 1 ) "*"                                   
##           attributes.RestaurantsTakeOutTrue
## 1  ( 1 )  " "                              
## 2  ( 1 )  " "                              
## 3  ( 1 )  " "                              
## 4  ( 1 )  " "                              
## 5  ( 1 )  " "                              
## 6  ( 1 )  " "                              
## 7  ( 1 )  " "                              
## 8  ( 1 )  " "                              
## 9  ( 1 )  " "                              
## 10  ( 1 ) " "                              
## 11  ( 1 ) "*"                              
##           attributes.RestaurantsPriceRange22
## 1  ( 1 )  " "                               
## 2  ( 1 )  " "                               
## 3  ( 1 )  " "                               
## 4  ( 1 )  " "                               
## 5  ( 1 )  " "                               
## 6  ( 1 )  " "                               
## 7  ( 1 )  " "                               
## 8  ( 1 )  " "                               
## 9  ( 1 )  " "                               
## 10  ( 1 ) "*"                               
## 11  ( 1 ) "*"                               
##           attributes.RestaurantsPriceRange23
## 1  ( 1 )  " "                               
## 2  ( 1 )  " "                               
## 3  ( 1 )  " "                               
## 4  ( 1 )  " "                               
## 5  ( 1 )  " "                               
## 6  ( 1 )  " "                               
## 7  ( 1 )  " "                               
## 8  ( 1 )  " "                               
## 9  ( 1 )  "*"                               
## 10  ( 1 ) "*"                               
## 11  ( 1 ) "*"                               
##           attributes.RestaurantsPriceRange24
## 1  ( 1 )  " "                               
## 2  ( 1 )  " "                               
## 3  ( 1 )  " "                               
## 4  ( 1 )  " "                               
## 5  ( 1 )  " "                               
## 6  ( 1 )  " "                               
## 7  ( 1 )  " "                               
## 8  ( 1 )  "*"                               
## 9  ( 1 )  "*"                               
## 10  ( 1 ) "*"                               
## 11  ( 1 ) "*"                               
##           attributes.RestaurantsPriceRange2None
## 1  ( 1 )  " "                                  
## 2  ( 1 )  " "                                  
## 3  ( 1 )  " "                                  
## 4  ( 1 )  " "                                  
## 5  ( 1 )  " "                                  
## 6  ( 1 )  " "                                  
## 7  ( 1 )  " "                                  
## 8  ( 1 )  " "                                  
## 9  ( 1 )  " "                                  
## 10  ( 1 ) " "                                  
## 11  ( 1 ) " "                                  
##           attributes.OutdoorSeatingTrue attributes.WiFi'no'
## 1  ( 1 )  " "                           " "                
## 2  ( 1 )  " "                           " "                
## 3  ( 1 )  " "                           " "                
## 4  ( 1 )  "*"                           " "                
## 5  ( 1 )  "*"                           " "                
## 6  ( 1 )  "*"                           "*"                
## 7  ( 1 )  "*"                           "*"                
## 8  ( 1 )  "*"                           "*"                
## 9  ( 1 )  "*"                           "*"                
## 10  ( 1 ) "*"                           "*"                
## 11  ( 1 ) "*"                           "*"                
##           attributes.WiFi'paid' attributes.Alcohol'full_bar'
## 1  ( 1 )  " "                   " "                         
## 2  ( 1 )  " "                   " "                         
## 3  ( 1 )  " "                   " "                         
## 4  ( 1 )  " "                   " "                         
## 5  ( 1 )  " "                   " "                         
## 6  ( 1 )  " "                   " "                         
## 7  ( 1 )  " "                   "*"                         
## 8  ( 1 )  " "                   "*"                         
## 9  ( 1 )  " "                   "*"                         
## 10  ( 1 ) " "                   "*"                         
## 11  ( 1 ) " "                   "*"                         
##           attributes.Alcohol'none' attributes.RestaurantsAttire'dressy'
## 1  ( 1 )  " "                      " "                                 
## 2  ( 1 )  " "                      " "                                 
## 3  ( 1 )  " "                      " "                                 
## 4  ( 1 )  " "                      " "                                 
## 5  ( 1 )  " "                      " "                                 
## 6  ( 1 )  " "                      " "                                 
## 7  ( 1 )  " "                      " "                                 
## 8  ( 1 )  " "                      " "                                 
## 9  ( 1 )  " "                      " "                                 
## 10  ( 1 ) " "                      " "                                 
## 11  ( 1 ) " "                      " "                                 
##           attributes.RestaurantsAttire'formal'
## 1  ( 1 )  " "                                 
## 2  ( 1 )  " "                                 
## 3  ( 1 )  " "                                 
## 4  ( 1 )  " "                                 
## 5  ( 1 )  " "                                 
## 6  ( 1 )  " "                                 
## 7  ( 1 )  " "                                 
## 8  ( 1 )  " "                                 
## 9  ( 1 )  " "                                 
## 10  ( 1 ) " "                                 
## 11  ( 1 ) " "                                 
##           attributes.RestaurantsAttireNone
## 1  ( 1 )  " "                             
## 2  ( 1 )  " "                             
## 3  ( 1 )  " "                             
## 4  ( 1 )  " "                             
## 5  ( 1 )  " "                             
## 6  ( 1 )  " "                             
## 7  ( 1 )  " "                             
## 8  ( 1 )  " "                             
## 9  ( 1 )  " "                             
## 10  ( 1 ) " "                             
## 11  ( 1 ) " "

The output below shows the results of feature selection using backward and forward selection. There are no differences between the two methods: at every model size from one to nine variables, both select the same set of variables. Table service enters first, followed by loud noise level, quiet noise level and outdoor seating.

## Reordering variables and trying again:

## Subset selection object
## Call: regsubsets.formula(stars ~ attributes.NoiseLevel + attributes.RestaurantsTableService + 
##     attributes.RestaurantsTakeOut + attributes.RestaurantsPriceRange2 + 
##     attributes.OutdoorSeating + attributes.WiFi + attributes.Alcohol + 
##     attributes.RestaurantsAttire, data = food_data, method = "backward")
## 17 Variables  (and intercept)
##                                        Forced in Forced out
## attributes.NoiseLevel'loud'                FALSE      FALSE
## attributes.NoiseLevel'quiet'               FALSE      FALSE
## attributes.NoiseLevel'very_loud'           FALSE      FALSE
## attributes.RestaurantsTableServiceTrue     FALSE      FALSE
## attributes.RestaurantsTakeOutTrue          FALSE      FALSE
## attributes.RestaurantsPriceRange22         FALSE      FALSE
## attributes.RestaurantsPriceRange23         FALSE      FALSE
## attributes.RestaurantsPriceRange24         FALSE      FALSE
## attributes.OutdoorSeatingTrue              FALSE      FALSE
## attributes.WiFi'no'                        FALSE      FALSE
## attributes.WiFi'paid'                      FALSE      FALSE
## attributes.Alcohol'full_bar'               FALSE      FALSE
## attributes.Alcohol'none'                   FALSE      FALSE
## attributes.RestaurantsAttire'dressy'       FALSE      FALSE
## attributes.RestaurantsPriceRange2None      FALSE      FALSE
## attributes.RestaurantsAttire'formal'       FALSE      FALSE
## attributes.RestaurantsAttireNone           FALSE      FALSE
## 1 subsets of each size up to 9
## Selection Algorithm: backward
##          attributes.NoiseLevel'loud' attributes.NoiseLevel'quiet'
## 1  ( 1 ) " "                         " "                         
## 2  ( 1 ) "*"                         " "                         
## 3  ( 1 ) "*"                         "*"                         
## 4  ( 1 ) "*"                         "*"                         
## 5  ( 1 ) "*"                         "*"                         
## 6  ( 1 ) "*"                         "*"                         
## 7  ( 1 ) "*"                         "*"                         
## 8  ( 1 ) "*"                         "*"                         
## 9  ( 1 ) "*"                         "*"                         
##          attributes.NoiseLevel'very_loud'
## 1  ( 1 ) " "                             
## 2  ( 1 ) " "                             
## 3  ( 1 ) " "                             
## 4  ( 1 ) " "                             
## 5  ( 1 ) "*"                             
## 6  ( 1 ) "*"                             
## 7  ( 1 ) "*"                             
## 8  ( 1 ) "*"                             
## 9  ( 1 ) "*"                             
##          attributes.RestaurantsTableServiceTrue
## 1  ( 1 ) "*"                                   
## 2  ( 1 ) "*"                                   
## 3  ( 1 ) "*"                                   
## 4  ( 1 ) "*"                                   
## 5  ( 1 ) "*"                                   
## 6  ( 1 ) "*"                                   
## 7  ( 1 ) "*"                                   
## 8  ( 1 ) "*"                                   
## 9  ( 1 ) "*"                                   
##          attributes.RestaurantsTakeOutTrue
## 1  ( 1 ) " "                              
## 2  ( 1 ) " "                              
## 3  ( 1 ) " "                              
## 4  ( 1 ) " "                              
## 5  ( 1 ) " "                              
## 6  ( 1 ) " "                              
## 7  ( 1 ) " "                              
## 8  ( 1 ) " "                              
## 9  ( 1 ) " "                              
##          attributes.RestaurantsPriceRange22
## 1  ( 1 ) " "                               
## 2  ( 1 ) " "                               
## 3  ( 1 ) " "                               
## 4  ( 1 ) " "                               
## 5  ( 1 ) " "                               
## 6  ( 1 ) " "                               
## 7  ( 1 ) " "                               
## 8  ( 1 ) " "                               
## 9  ( 1 ) " "                               
##          attributes.RestaurantsPriceRange23
## 1  ( 1 ) " "                               
## 2  ( 1 ) " "                               
## 3  ( 1 ) " "                               
## 4  ( 1 ) " "                               
## 5  ( 1 ) " "                               
## 6  ( 1 ) " "                               
## 7  ( 1 ) " "                               
## 8  ( 1 ) " "                               
## 9  ( 1 ) "*"                               
##          attributes.RestaurantsPriceRange24
## 1  ( 1 ) " "                               
## 2  ( 1 ) " "                               
## 3  ( 1 ) " "                               
## 4  ( 1 ) " "                               
## 5  ( 1 ) " "                               
## 6  ( 1 ) " "                               
## 7  ( 1 ) " "                               
## 8  ( 1 ) "*"                               
## 9  ( 1 ) "*"                               
##          attributes.RestaurantsPriceRange2None
## 1  ( 1 ) " "                                  
## 2  ( 1 ) " "                                  
## 3  ( 1 ) " "                                  
## 4  ( 1 ) " "                                  
## 5  ( 1 ) " "                                  
## 6  ( 1 ) " "                                  
## 7  ( 1 ) " "                                  
## 8  ( 1 ) " "                                  
## 9  ( 1 ) " "                                  
##          attributes.OutdoorSeatingTrue attributes.WiFi'no'
## 1  ( 1 ) " "                           " "                
## 2  ( 1 ) " "                           " "                
## 3  ( 1 ) " "                           " "                
## 4  ( 1 ) "*"                           " "                
## 5  ( 1 ) "*"                           " "                
## 6  ( 1 ) "*"                           "*"                
## 7  ( 1 ) "*"                           "*"                
## 8  ( 1 ) "*"                           "*"                
## 9  ( 1 ) "*"                           "*"                
##          attributes.WiFi'paid' attributes.Alcohol'full_bar'
## 1  ( 1 ) " "                   " "                         
## 2  ( 1 ) " "                   " "                         
## 3  ( 1 ) " "                   " "                         
## 4  ( 1 ) " "                   " "                         
## 5  ( 1 ) " "                   " "                         
## 6  ( 1 ) " "                   " "                         
## 7  ( 1 ) " "                   "*"                         
## 8  ( 1 ) " "                   "*"                         
## 9  ( 1 ) " "                   "*"                         
##          attributes.Alcohol'none' attributes.RestaurantsAttire'dressy'
## 1  ( 1 ) " "                      " "                                 
## 2  ( 1 ) " "                      " "                                 
## 3  ( 1 ) " "                      " "                                 
## 4  ( 1 ) " "                      " "                                 
## 5  ( 1 ) " "                      " "                                 
## 6  ( 1 ) " "                      " "                                 
## 7  ( 1 ) " "                      " "                                 
## 8  ( 1 ) " "                      " "                                 
## 9  ( 1 ) " "                      " "                                 
##          attributes.RestaurantsAttire'formal'
## 1  ( 1 ) " "                                 
## 2  ( 1 ) " "                                 
## 3  ( 1 ) " "                                 
## 4  ( 1 ) " "                                 
## 5  ( 1 ) " "                                 
## 6  ( 1 ) " "                                 
## 7  ( 1 ) " "                                 
## 8  ( 1 ) " "                                 
## 9  ( 1 ) " "                                 
##          attributes.RestaurantsAttireNone
## 1  ( 1 ) " "                             
## 2  ( 1 ) " "                             
## 3  ( 1 ) " "                             
## 4  ( 1 ) " "                             
## 5  ( 1 ) " "                             
## 6  ( 1 ) " "                             
## 7  ( 1 ) " "                             
## 8  ( 1 ) " "                             
## 9  ( 1 ) " "

## Reordering variables and trying again:

## Subset selection object
## Call: regsubsets.formula(stars ~ attributes.NoiseLevel + attributes.RestaurantsTableService + 
##     attributes.RestaurantsTakeOut + attributes.RestaurantsPriceRange2 + 
##     attributes.OutdoorSeating + attributes.WiFi + attributes.Alcohol + 
##     attributes.RestaurantsAttire, data = food_data, method = "forward")
## 17 Variables  (and intercept)
##                                        Forced in Forced out
## attributes.NoiseLevel'loud'                FALSE      FALSE
## attributes.NoiseLevel'quiet'               FALSE      FALSE
## attributes.NoiseLevel'very_loud'           FALSE      FALSE
## attributes.RestaurantsTableServiceTrue     FALSE      FALSE
## attributes.RestaurantsTakeOutTrue          FALSE      FALSE
## attributes.RestaurantsPriceRange22         FALSE      FALSE
## attributes.RestaurantsPriceRange23         FALSE      FALSE
## attributes.RestaurantsPriceRange24         FALSE      FALSE
## attributes.OutdoorSeatingTrue              FALSE      FALSE
## attributes.WiFi'no'                        FALSE      FALSE
## attributes.WiFi'paid'                      FALSE      FALSE
## attributes.Alcohol'full_bar'               FALSE      FALSE
## attributes.Alcohol'none'                   FALSE      FALSE
## attributes.RestaurantsAttire'dressy'       FALSE      FALSE
## attributes.RestaurantsPriceRange2None      FALSE      FALSE
## attributes.RestaurantsAttire'formal'       FALSE      FALSE
## attributes.RestaurantsAttireNone           FALSE      FALSE
## 1 subsets of each size up to 9
## Selection Algorithm: forward
##          attributes.NoiseLevel'loud' attributes.NoiseLevel'quiet'
## 1  ( 1 ) " "                         " "                         
## 2  ( 1 ) "*"                         " "                         
## 3  ( 1 ) "*"                         "*"                         
## 4  ( 1 ) "*"                         "*"                         
## 5  ( 1 ) "*"                         "*"                         
## 6  ( 1 ) "*"                         "*"                         
## 7  ( 1 ) "*"                         "*"                         
## 8  ( 1 ) "*"                         "*"                         
## 9  ( 1 ) "*"                         "*"                         
##          attributes.NoiseLevel'very_loud'
## 1  ( 1 ) " "                             
## 2  ( 1 ) " "                             
## 3  ( 1 ) " "                             
## 4  ( 1 ) " "                             
## 5  ( 1 ) "*"                             
## 6  ( 1 ) "*"                             
## 7  ( 1 ) "*"                             
## 8  ( 1 ) "*"                             
## 9  ( 1 ) "*"                             
##          attributes.RestaurantsTableServiceTrue
## 1  ( 1 ) "*"                                   
## 2  ( 1 ) "*"                                   
## 3  ( 1 ) "*"                                   
## 4  ( 1 ) "*"                                   
## 5  ( 1 ) "*"                                   
## 6  ( 1 ) "*"                                   
## 7  ( 1 ) "*"                                   
## 8  ( 1 ) "*"                                   
## 9  ( 1 ) "*"                                   
##          attributes.RestaurantsTakeOutTrue
## 1  ( 1 ) " "                              
## 2  ( 1 ) " "                              
## 3  ( 1 ) " "                              
## 4  ( 1 ) " "                              
## 5  ( 1 ) " "                              
## 6  ( 1 ) " "                              
## 7  ( 1 ) " "                              
## 8  ( 1 ) " "                              
## 9  ( 1 ) " "                              
##          attributes.RestaurantsPriceRange22
## 1  ( 1 ) " "                               
## 2  ( 1 ) " "                               
## 3  ( 1 ) " "                               
## 4  ( 1 ) " "                               
## 5  ( 1 ) " "                               
## 6  ( 1 ) " "                               
## 7  ( 1 ) " "                               
## 8  ( 1 ) " "                               
## 9  ( 1 ) " "                               
##          attributes.RestaurantsPriceRange23
## 1  ( 1 ) " "                               
## 2  ( 1 ) " "                               
## 3  ( 1 ) " "                               
## 4  ( 1 ) " "                               
## 5  ( 1 ) " "                               
## 6  ( 1 ) " "                               
## 7  ( 1 ) " "                               
## 8  ( 1 ) " "                               
## 9  ( 1 ) "*"                               
##          attributes.RestaurantsPriceRange24
## 1  ( 1 ) " "                               
## 2  ( 1 ) " "                               
## 3  ( 1 ) " "                               
## 4  ( 1 ) " "                               
## 5  ( 1 ) " "                               
## 6  ( 1 ) " "                               
## 7  ( 1 ) " "                               
## 8  ( 1 ) "*"                               
## 9  ( 1 ) "*"                               
##          attributes.RestaurantsPriceRange2None
## 1  ( 1 ) " "                                  
## 2  ( 1 ) " "                                  
## 3  ( 1 ) " "                                  
## 4  ( 1 ) " "                                  
## 5  ( 1 ) " "                                  
## 6  ( 1 ) " "                                  
## 7  ( 1 ) " "                                  
## 8  ( 1 ) " "                                  
## 9  ( 1 ) " "                                  
##          attributes.OutdoorSeatingTrue attributes.WiFi'no'
## 1  ( 1 ) " "                           " "                
## 2  ( 1 ) " "                           " "                
## 3  ( 1 ) " "                           " "                
## 4  ( 1 ) "*"                           " "                
## 5  ( 1 ) "*"                           " "                
## 6  ( 1 ) "*"                           "*"                
## 7  ( 1 ) "*"                           "*"                
## 8  ( 1 ) "*"                           "*"                
## 9  ( 1 ) "*"                           "*"                
##          attributes.WiFi'paid' attributes.Alcohol'full_bar'
## 1  ( 1 ) " "                   " "                         
## 2  ( 1 ) " "                   " "                         
## 3  ( 1 ) " "                   " "                         
## 4  ( 1 ) " "                   " "                         
## 5  ( 1 ) " "                   " "                         
## 6  ( 1 ) " "                   " "                         
## 7  ( 1 ) " "                   "*"                         
## 8  ( 1 ) " "                   "*"                         
## 9  ( 1 ) " "                   "*"                         
##          attributes.Alcohol'none' attributes.RestaurantsAttire'dressy'
## 1  ( 1 ) " "                      " "                                 
## 2  ( 1 ) " "                      " "                                 
## 3  ( 1 ) " "                      " "                                 
## 4  ( 1 ) " "                      " "                                 
## 5  ( 1 ) " "                      " "                                 
## 6  ( 1 ) " "                      " "                                 
## 7  ( 1 ) " "                      " "                                 
## 8  ( 1 ) " "                      " "                                 
## 9  ( 1 ) " "                      " "                                 
##          attributes.RestaurantsAttire'formal'
## 1  ( 1 ) " "                                 
## 2  ( 1 ) " "                                 
## 3  ( 1 ) " "                                 
## 4  ( 1 ) " "                                 
## 5  ( 1 ) " "                                 
## 6  ( 1 ) " "                                 
## 7  ( 1 ) " "                                 
## 8  ( 1 ) " "                                 
## 9  ( 1 ) " "                                 
##          attributes.RestaurantsAttireNone
## 1  ( 1 ) " "                             
## 2  ( 1 ) " "                             
## 3  ( 1 ) " "                             
## 4  ( 1 ) " "                             
## 5  ( 1 ) " "                             
## 6  ( 1 ) " "                             
## 7  ( 1 ) " "                             
## 8  ( 1 ) " "                             
## 9  ( 1 ) " "
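
As a rough sketch of how the forward and backward comparison above could be checked in code rather than by eye, the snippet below runs `regsubsets()` from the leaps package (the function named in the Call lines) with both methods and compares the selected variables. The `toy` data frame, its column names and the simulated ratings are made up for illustration and are not the Yelp data, so the result here says nothing about `food_data`.

```r
# Sketch only: `toy` is a synthetic stand-in for food_data, not the Yelp data.
library(leaps)  # provides regsubsets()

set.seed(1)
n <- 500
toy <- data.frame(
  noise   = factor(sample(c("average", "loud", "quiet"), n, replace = TRUE)),
  table   = factor(sample(c("False", "True"), n, replace = TRUE)),
  outdoor = factor(sample(c("False", "True"), n, replace = TRUE)),
  wifi    = factor(sample(c("free", "no"), n, replace = TRUE))
)
# Simulated ratings: table service helps, loud noise hurts, plus random error.
toy$stars <- 3.4 + 0.3 * (toy$table == "True") -
  0.4 * (toy$noise == "loud") + rnorm(n, sd = 0.7)

# Same call shape as in the report, once per search direction.
fwd <- regsubsets(stars ~ ., data = toy, nvmax = 5, method = "forward")
bwd <- regsubsets(stars ~ ., data = toy, nvmax = 5, method = "backward")

# $which is a logical matrix: one row per model size, one column per term.
fwd_which <- summary(fwd)$which
bwd_which <- summary(bwd)$which

# TRUE only if both searches pick the same variables at every model size.
same_path <- identical(fwd_which, bwd_which)
print(same_path)
```

Comparing the `which` matrices directly avoids scanning the starred tables column by column. On real data the two searches can disagree, so the comparison is worth rerunning whenever the predictors change.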

Linear Model

We now fit a multiple linear regression to analyze the effects of restaurant attributes on star ratings. The coefficients on the independent variables and their p-values let us make inferences about how each attribute relates to star ratings, holding the other attributes fixed. From these results, ‘Noise Level’ is a statistically significant factor: relative to restaurants with average noise, quiet restaurants are rated about 0.19 stars higher, while loud and very loud restaurants are rated about 0.40 and 0.62 stars lower. Restaurants with table service (+0.28), take-out (+0.18) and outdoor seating (+0.17) also have higher ratings at a statistically significant level. Restaurants with a full bar are rated about 0.19 stars lower than restaurants with only beer and wine service. The coefficient comparing ‘no alcohol’ to beer and wine is not statistically significant (p = 0.63), so we cannot make any inferences in this case. Restaurants with free WiFi are rated about 0.14 stars higher than restaurants with no WiFi, also at a statistically significant level. Price ranges 2 through 4 carry positive, statistically significant coefficients as well. Note that the model explains only about 12% of the variance in star ratings (R-squared = 0.12), so these attributes account for a small share of what drives ratings.

## 
## Call:
## lm(formula = stars ~ attributes.NoiseLevel + attributes.RestaurantsTableService + 
##     attributes.RestaurantsTakeOut + attributes.RestaurantsPriceRange2 + 
##     attributes.OutdoorSeating + attributes.WiFi + attributes.Alcohol + 
##     attributes.RestaurantsAttire, data = food_data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.9746 -0.4076  0.0787  0.4931  1.7556 
## 
## Coefficients:
##                                        Estimate Std. Error t value
## (Intercept)                              3.3823     0.0879   38.46
## attributes.NoiseLevel'loud'             -0.4015     0.0659   -6.09
## attributes.NoiseLevel'quiet'             0.1912     0.0406    4.71
## attributes.NoiseLevel'very_loud'        -0.6247     0.1421   -4.39
## attributes.RestaurantsTableServiceTrue   0.2778     0.0393    7.07
## attributes.RestaurantsTakeOutTrue        0.1800     0.0736    2.44
## attributes.RestaurantsPriceRange22       0.1064     0.0395    2.69
## attributes.RestaurantsPriceRange23       0.3621     0.1015    3.57
## attributes.RestaurantsPriceRange24       0.6711     0.1882    3.57
## attributes.OutdoorSeatingTrue            0.1695     0.0333    5.09
## attributes.WiFi'no'                     -0.1409     0.0314   -4.50
## attributes.WiFi'paid'                   -0.3246     0.1958   -1.66
## attributes.Alcohol'full_bar'            -0.1932     0.0467   -4.14
## attributes.Alcohol'none'                -0.0220     0.0463   -0.48
## attributes.RestaurantsAttire'dressy'    -0.0684     0.1389   -0.49
##                                        Pr(>|t|)    
## (Intercept)                             < 2e-16 ***
## attributes.NoiseLevel'loud'             1.3e-09 ***
## attributes.NoiseLevel'quiet'            2.7e-06 ***
## attributes.NoiseLevel'very_loud'        1.2e-05 ***
## attributes.RestaurantsTableServiceTrue  2.1e-12 ***
## attributes.RestaurantsTakeOutTrue       0.01461 *  
## attributes.RestaurantsPriceRange22      0.00721 ** 
## attributes.RestaurantsPriceRange23      0.00037 ***
## attributes.RestaurantsPriceRange24      0.00037 ***
## attributes.OutdoorSeatingTrue           4.0e-07 ***
## attributes.WiFi'no'                     7.4e-06 ***
## attributes.WiFi'paid'                   0.09758 .  
## attributes.Alcohol'full_bar'            3.6e-05 ***
## attributes.Alcohol'none'                0.63468    
## attributes.RestaurantsAttire'dressy'    0.62267    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.673 on 1989 degrees of freedom
##   (6929 observations deleted due to missingness)
## Multiple R-squared:  0.12,   Adjusted R-squared:  0.113 
## F-statistic: 19.3 on 14 and 1989 DF,  p-value: <2e-16
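
To make the coefficients concrete, here is a small worked example that adds up the estimates from the summary above for two hypothetical restaurant profiles. The profiles and the short coefficient names are invented for illustration. The sketch also assumes that every attribute left out sits at its reference level (average noise, free WiFi, beer and wine, no take-out, and the lowest price range and default attire, which the output does not name explicitly).

```r
# Estimates copied from the lm() summary above (only the terms used here).
coefs <- c(
  intercept     =  3.3823,
  noise_quiet   =  0.1912,
  noise_loud    = -0.4015,
  table_service =  0.2778,
  outdoor       =  0.1695,
  full_bar      = -0.1932
)

# Hypothetical profiles: 1 = attribute present, 0 = reference level.
quiet_patio <- c(intercept = 1, noise_quiet = 1, noise_loud = 0,
                 table_service = 1, outdoor = 1, full_bar = 0)
loud_bar    <- c(intercept = 1, noise_quiet = 0, noise_loud = 1,
                 table_service = 0, outdoor = 0, full_bar = 1)

# A linear-model prediction is the sum of coefficient * indicator.
pred_quiet <- sum(coefs * quiet_patio[names(coefs)])
pred_loud  <- sum(coefs * loud_bar[names(coefs)])

print(round(c(quiet_patio = pred_quiet, loud_bar = pred_loud), 4))
```

Under these assumptions the quiet restaurant with table service and outdoor seating is predicted at about 4.02 stars and the loud full-bar restaurant at about 2.79 stars. Given the residual standard error of 0.673, individual restaurants can fall well away from either figure.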

With this exploratory data analysis complete, we have a good sense of direction for building more robust models and providing valuable insight for Las Vegas business owners!